Add troubleshooting guide for certificate re-issuance loops #2128
wallrj-cyberark wants to merge 1 commit into
wallrj-cyberark left a comment
Self-review of the new troubleshooting page.
Scope: Four files changed — all documentation, no code. New troubleshooting page covering two main scenarios (external Secret manager conflicts, duplicate spec.secretName), plus navigation updates in manifest.json and the troubleshooting index.
Checks: check:eslint, check:stylelint, check:markdown (remark), and check:spelling all pass. The check:links failures are pre-existing (mailto: example addresses in ingress.md, gateway.md, annotations.md).
Netlify preview: the page renders correctly with admonitions, code blocks, and the table of contents sidebar.
Generated with Claude (Opus 4.6)
1. cert-manager successfully issues a certificate and writes the key material
   to the target Secret.
2. The external controller overwrites or patches the Secret (e.g. on its next
   sync interval), replacing the private key and/or certificate data.
The SecretPublicKeyDiffersFromCurrentCertificateRequest function name is taken directly from the cert-manager source at internal/controller/certificates/policies/checks.go:189. Including it here helps users who search logs or source for this identifier.
   sync interval), replacing the private key and/or certificate data.
3. cert-manager detects that the private key in the Secret no longer matches
   the CSR in the current CertificateRequest
   (`SecretPublicKeyDiffersFromCurrentCertificateRequest`).
The key insight for this scenario: backoff counters reset on successful issuance, so the loop runs without delay. This is not a bug — it is correct behaviour — but it means that the external write triggers unbounded re-issuance. The shouldBackoffReissuingOnFailure function in pkg/controller/certificates/trigger/trigger_controller.go:287 only backs off when status.lastFailureTime is set.
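The backoff behaviour described in this comment can be illustrated with a minimal Python sketch. It is a simplified model, not cert-manager's Go implementation: the `CertificateStatus` class, the `should_back_off` function, and the one-hour delay are hypothetical names and values chosen for illustration. It shows only the property stated above: when no failure time is recorded, nothing delays the next issuance.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical retry delay, for illustration only (an assumption, not
# necessarily the value cert-manager uses).
ASSUMED_RETRY_DELAY = timedelta(hours=1)


@dataclass
class CertificateStatus:
    # Models status.lastFailureTime: None means no failure is recorded,
    # which is the state after a successful issuance.
    last_failure_time: Optional[datetime] = None


def should_back_off(status: CertificateStatus, now: datetime,
                    delay: timedelta = ASSUMED_RETRY_DELAY) -> bool:
    """Return True only if a failure is recorded and the delay has not elapsed."""
    if status.last_failure_time is None:
        # No recorded failure: re-issuance may start immediately.
        return False
    return now < status.last_failure_time + delay


now = datetime(2024, 1, 1, 12, 0, 0)
after_success = CertificateStatus(last_failure_time=None)
after_failure = CertificateStatus(last_failure_time=now - timedelta(minutes=5))

print(should_back_off(after_success, now))
print(should_back_off(after_failure, now))
```

In this model, every external overwrite that follows a successful issuance finds `last_failure_time` unset, so each iteration of the loop proceeds without waiting.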
### Diagnosis

Check for duplicate `spec.secretName` values across Certificate resources:
The uniq -d -f1 approach finds duplicates by the second tab-separated field (secretName). This will correctly flag cases where two Certificates in different namespaces share a secretName, though those cannot actually conflict (Secrets are namespace-scoped). A reviewer might suggest scoping this to per-namespace duplicates only, but the broader output is arguably more useful as a diagnostic aid.
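The trade-off between cluster-wide and per-namespace duplicate detection can be sketched in Python. This is not the command from the page: it assumes the Certificate data has already been collected as `(namespace, name, secretName)` tuples, and the function name and sample values are hypothetical.

```python
from collections import defaultdict


def duplicate_secret_names(certs, per_namespace):
    """Group Certificates by secretName and keep groups with more than one member.

    certs: iterable of (namespace, name, secret_name) tuples.
    per_namespace: if True, two Certificates only count as duplicates when
    they are in the same namespace (the only case that can really conflict).
    """
    groups = defaultdict(list)
    for namespace, name, secret_name in certs:
        key = (namespace, secret_name) if per_namespace else secret_name
        groups[key].append(f"{namespace}/{name}")
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical sample data.
certs = [
    ("team-a", "web", "web-tls"),
    ("team-a", "web-copy", "web-tls"),  # same namespace: a real conflict
    ("team-b", "api", "api-tls"),
    ("team-c", "api", "api-tls"),       # different namespaces: no conflict
]

broad = duplicate_secret_names(certs, per_namespace=False)
scoped = duplicate_secret_names(certs, per_namespace=True)
print(broad)
print(scoped)
```

The broad grouping reports both `web-tls` and `api-tls`, while the scoped grouping reports only the `team-a` pair, which is the only one where two Certificates write to the same Secret.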
3ee6e58 to dcdfa47
53e78e9 to 888ab6c
- Document the root cause of infinite re-issuance loops when an external controller (e.g. External Secrets Operator) overwrites a cert-manager-managed Secret
- Document the duplicate spec.secretName variant where two Certificate resources target the same Secret
- Include symptoms, diagnosis steps, and fixes for each scenario
- Reference GitHub issues #4846, #5675, #6988, #6992, #8380

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Richard Wall <richard.wall@cyberark.com>

888ab6c to 20bba20
Preview: https://deploy-preview-2128--cert-manager.netlify.app/docs/troubleshooting/certificate-reissuing-loop/
Summary
- ... infinite certificate re-issuance loops
- ... Driver) overwrites a cert-manager-managed Secret, causing a tight `SecretMismatch` re-issuance loop with no backoff
- ... `spec.secretName`, causing each to overwrite the other in a continuous loop
- ... `kubectl` commands, and fixes for each scenario
- Two certificates referencing the same secret causes cert manager to constantly modify the secret (cert-manager#5675)
- Race Condition: Cert-Manager Generates Endless Certificate Requests on Openshift (cert-manager#6988)
- v1.12.X release Infinite loop with 2 certs with different keystore settings (cert-manager#6992)
- cert-manager enters infinite re-issuance loop when Issuer returns invalid certificate (cert-manager#8380)
Motivation
A customer reported an infinite re-issuance loop caused by External Secrets
Operator overwriting the target Secret. The investigation revealed this is a
known class of problem with no existing documentation. This page fills that gap,
helping users diagnose and resolve the issue without needing to search through
GitHub issues.
Test plan
- `npm run check:spelling` passes (added `externalsecret` to `.spelling`)
- `npm run check:markdown` (remark) passes
- `npm run check:eslint` passes
- `npm run check:stylelint` passes
- `manifest.json` validates as correct JSON

Generated with Claude (Opus 4.6)